@derrickaw derrickaw commented Oct 8, 2025

Description

Rerunning these commands failed because the Bitnami image tags have been removed from Docker Hub (https://hub.docker.com/r/bitnami/kafka). The images have been moved to the bitnamilegacy repo, and no new tags have been published yet. This PR uses the Apache image instead.

Note: Before submitting a pull request, please open an issue for discussion if you are not associated with Google.

Checklist

  • I have followed Sample Format Guide
  • pom.xml parent set to latest shared-configuration
  • Appropriate changes to README are included in PR
  • These samples need a new API enabled in testing projects to pass (let us know which ones)
  • These samples need a new/updated env vars in testing projects set to pass (let us know which ones)
  • Tests pass: mvn clean verify required
  • Lint passes: mvn -P lint checkstyle:check required
  • Static Analysis: mvn -P lint clean compile pmd:cpd-check spotbugs:check advisory only
  • This sample adds a new sample directory, and I updated the CODEOWNERS file with the codeowners for this sample
  • This sample adds a new Product API, and I updated the Blunderbuss issue/PR auto-assigner with the codeowners for this sample
  • Please merge this PR for me once it is approved

@product-auto-label product-auto-label bot added samples Issues that are directly related to samples. api: dataflow Issues related to the Dataflow API. labels Oct 8, 2025
@derrickaw derrickaw marked this pull request as ready for review October 8, 2025 18:54
@derrickaw derrickaw requested review from a team and yoshi-approver as code owners October 8, 2025 18:54
@iennae iennae left a comment

Thanks for proposing this change. Bitnami doesn't appear to offer a secure Kafka image in their new bitnamisecure repo, so using the official Apache release looks like the right path forward.

I've got two main points of feedback, with the second being critical for ensuring the sample works as intended.

Image Transition & Configuration
I totally understand that the Kafka setup is secondary to the Dataflow portion, but I'm wondering how thoroughly you've verified the changes, given that this swaps a "polished" image (Bitnami) for an "upstream" one (Apache):

Documentation Context: Could you please add a brief note to the README explaining why we are moving from bitnami/kafka to apache/kafka? This helps future contributors (including sample reviewers) and users understand the dependency choice.

Default Configuration: The Bitnami image often included built-in security defaults and environment variables. Have you confirmed that the basic apache/kafka image is a functional drop-in replacement for this simple setup, or whether additional variables are needed to maintain security/stability? We might want to add a note at the top stating that this sample shows a local-development setup, not a production-quality Kafka configuration.

I believe that apache/kafka:latest currently points to the 4.x.x stream (https://hub.docker.com/r/apache/kafka), which runs Kafka without ZooKeeper. The tutorial documentation still contains steps and infrastructure setup (a firewall rule for port 2181) that assume a ZooKeeper-based Kafka instance (which the original bitnami/kafka:3.4.0 was). If we use 4.x while retaining the ZooKeeper setup instructions, the commands in the "Sending messages to Kafka server" section will likely fail, blocking the user from running the Dataflow pipeline.
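
One rough way to confirm whether a given setup still exposes ZooKeeper is to probe the port the tutorial opens. The sketch below is only an illustration and not part of this PR: `port_is_open` is a hypothetical helper name, and the host is an assumption to be replaced with wherever the Kafka container runs. It uses 2181 because that is the ZooKeeper client port the firewall rule refers to.

```python
import socket


def port_is_open(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        # create_connection raises OSError (refused, unreachable, timed out)
        # when nothing accepts the connection.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


# Example call (assumed host): port_is_open("localhost", 2181)
```

If this returns False against a running Kafka container, the ZooKeeper-based steps in the tutorial are unlikely to work as written. A True result only shows that something is listening on that port, not that it is ZooKeeper.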

I think this means either refactoring the instructions or pinning the image to a specific, recent 3.x.x version that is confirmed to be ZooKeeper-compatible and serves as a direct functional replacement. In general, it's better to pin to a specific version in samples so that users can replicate a known good experience.
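
To make the pinning point concrete, here is a minimal sketch of a check that flags floating image references. It is not part of this PR: `is_pinned` is a hypothetical helper, and treating only a full major.minor.patch tag (or a digest) as "pinned" is an assumption about what counts as reproducible enough for a sample.

```python
import re


def is_pinned(image_ref: str) -> bool:
    """Return True if image_ref carries a digest or a major.minor.patch tag."""
    # A digest (image@sha256:...) is the strictest form of pinning.
    if "@sha256:" in image_ref:
        return True
    # Look for the tag only after the last '/', so a registry port
    # (host:5000/image) is not mistaken for a tag.
    name = image_ref.rsplit("/", 1)[-1]
    if ":" not in name:
        return False  # no tag: Docker falls back to 'latest'
    tag = name.split(":", 1)[1]
    # Accept 3.4.0 or 3.4.0-suffix; reject 'latest', '3', '3.4', etc.
    return re.fullmatch(r"\d+\.\d+\.\d+([-.].+)?", tag) is not None


print(is_pinned("bitnami/kafka:3.4.0"))   # True
print(is_pinned("apache/kafka:latest"))   # False
```

A check like this could run over the README or a test script to catch a sample drifting back to an unpinned tag. It says nothing about whether the pinned version is actually ZooKeeper-compatible, which still has to be confirmed by hand.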
